Hierarchical Machine Translation With Discontinuous Phrases

نویسنده

  • Miriam Kaeshammer
چکیده

We present a hierarchical statistical machine translation system which supports discontinuous constituents. It is based on synchronous linear context-free rewriting systems (SLCFRS), an extension to synchronous context-free grammars in which synchronized non-terminals span k ≥ 1 continuous blocks on either side of the bitext. This extension beyond contextfreeness is motivated by certain complex alignment configurations that are beyond the alignment capacity of current translation models and their relatively frequent occurrence in hand-aligned data. Our experiments for translating from German to English demonstrate the feasibility of training and decoding with more expressive translation models such as SLCFRS and show a modest improvement over a context-free baseline.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Source-Side Discontinuous Phrases for Machine Translation: A Comparative Study on Phrase Extraction and Search

Standard phrase-based statistical machine translation systems generate translations based on an inventory of continuous bilingual phrases. In this work, we extend a phrase-based decoder with the ability to make use of phrases that are discontinuous in the source part. Our dynamic programming beam search algorithm supports separate pruning of coverage hypotheses per cardinality and of lexical hy...

متن کامل

Accurate Non-Hierarchical Phrase-Based Translation

A principal weakness of conventional (i.e., non-hierarchical) phrase-based statistical machine translation is that it can only exploit continuous phrases. In this paper, we extend phrase-based decoding to allow both source and target phrasal discontinuities, which provide better generalization on unseen data and yield significant improvements to a standard phrase-based system (Moses). More inte...

متن کامل

Effective Use of Discontinuous Phrases for Hierarchical Phrase-based Translation

Hierarchical phrase-based (HPB) models have shown strong capability in generalization and reordering. However, they are heavily dependent on continuous phrases and are difficult for modeling natural linguistic discontinuities directly. In this paper, we propose a novel approach for integrating discontinuous phrases into the Chinese-to-English HPB system. We focus on the extraction method of dis...

متن کامل

A Dictionary Lookup Strategy for Translating Discontinuous Phrases

Translation of discontinuous phrases is a major challenge in Machine Translation. Within METIS-II we developed a dictionary lookup strategy by mapping the items of a dictionary entry on non-adjacent words in an input text. Mapping is controlled through so-called contextual rejection, i.e. inappropriate mappings are discarded if they fail to satisfy a predefined set of constraints. We present va...

متن کامل

Analysing soft syntax features and heuristics for hierarchical phrase based machine translation

Similar to phrase-based machine translation, hierarchical systems produce a large proportion of phrases, most of which are supposedly junk and useless for the actual translation. For the hierarchical case, however, the amount of extracted rules is an order of magnitude bigger. In this paper, we investigate several soft constraints in the extraction of hierarchical phrases and whether these help...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2015